EOSDIS DADS REQUIREMENTS
by J. Berbert and B. Kobler 


ABSTRACT 

A brief summary of the EOSDIS Core System (ECS) DADS requirements is given, 
including the ECS relationship to EOSDIS Version-0, the phased implementation of ECS, 
and the daily data volumes for ingest, archive, and distribution anticipated at each of the 
7 Distributed Active Archive Centers (DAACs). 

EOS GOALS 

The Earth Observing System Data and Information System (EOSDIS) Data Archive and 
Distribution System (DADS) is part of the Earth Observing System (EOS) program. The 
EOS program goals are given in Fig. 1. In short, the goals are to acquire, access, and 
analyze Earth science data as NASA's contribution to the Global Change Research 
Program. 

PHASED IMPLEMENTATION 

The full capability of the EOSDIS DADS is built up in a series of steps, as indicated in 
Fig. 2. Version-0 (V0) implementation began in 1990 with an estimated data volume of 5 
Terabytes (TB), which is expected to grow to 33 TB by 1994. 

Version-1 (V1) and Version-2 (V2) are part of the separately funded EOSDIS Core System 
(ECS). A request for proposals (RFP) for the 10-year contract to build the ECS was 
released by the Government on July 1, 1991. During V1 implementation the ECS 
archive is expected to grow from 10 TB to 40 TB, and the number of active DADS is 
expected to grow from 1 to 7. The 3 DADS at GSFC, Langley, and Marshall are to be 
operational for the Tropical Rainfall Measurement Mission (TRMM), which is 
scheduled for launch in 1997. The V2 implementation of ECS is primarily to support 
EOS-A1, with its order-of-magnitude increase in data products in the first year and a 
subsequent increase of about 330 TB, or one third of a Petabyte (PB), per year thereafter. 

EOSDIS V0 

The contribution anticipated from V0 and the specific relationship of V0 to ECS are 
given in Fig. 3. It is anticipated that V0 will provide significant heritage to ECS through 
prototyping efforts and by working towards interoperability amongst existing data 
systems. 

ECS SEGMENT AND DAACS 

A logical system architecture for ECS is shown in Fig. 4 (taken from the ECS RFP 
Statement of Work (SOW)). The 3 ECS segments shown are the Flight Operations 
Segment (FOS), the Communications and System Management Segment (CSMS), and 
the Science Data Processing Segment (SDPS). The SDPS includes the 7 Distributed Active 
Archive Centers (DAACs) and the Information Management System (IMS). A DAAC 
includes a Product Generation System (PGS) with a collocated DADS and a distributed 
part of the IMS. 

DAAC LOCATIONS 

The locations of the 7 DAACs are shown on the Fig. 5 map. They are at Goddard Space 
Flight Center (GSFC), Jet Propulsion Laboratory (JPL), EROS Data Center (EDC), 
Langley Research Center (LaRC), National Snow and Ice Data Center (NSIDC), Alaska 
SAR Facility (ASF), and Marshall Space Flight Center (MSFC). 

DADS FUNCTIONS AND REQUIREMENTS 

The 5 major DADS functions, namely Ingest, Archive, Process Orders, Manage System, 
and Distribution, are given in more detail in Fig. 6, along with some of the key 
performance requirements for ingest and distribution. A key performance 
requirement for ingest is to be capable of accepting Level-0 (L0) data from the Customer 
Data and Operations System (CDOS) at a high data rate. Key performance requirements for 
distribution are to provide data products ready for network distribution within an 
average of 5 minutes of receipt of a product order, and ready for physical media 
distribution within 24 hours of receipt of a product order. Also, the daily distribution 
capacity, for both network and media, must be equivalent to the daily ingest rate. 

DADS INTERFACE 

Fig. 7 is the Conceptual DADS Context Diagram taken from the RFP Requirements 
Specification. This illustrates the multitude of data exchange interfaces for DADS data 
ingest and distribution. Some entities on this diagram not previously identified are the 
Affiliated Data Centers (ADCs), Other Data Centers (ODCs), Earth Probe Data Systems 
(EPDSs), Science Computing Facilities (SCFs), and International Partners (IPs). 

DATA VOLUMES PER LEVEL 

In Fig. 8, the total DADS daily data volumes for the platforms EOS-A1, TRMM, and 
SAR are given for the data processing levels L0, L1A, L1B, L2, L3, and L4. SAR is the 
Synthetic Aperture Radar (SAR) platform, which is a separately funded option on 
the ECS contract. These daily data volumes are taken from the ECS Requirements 
Specification, Appendix C. As can be seen from this figure, the amount of data to be 
ingested, archived, managed, and distributed expands significantly from the amount of 
L0 data received from CDOS. For this set of platforms, the expansion factor is 3.6. 

The total daily data contribution from TRMM is 18 GB/day, or 6.6 TB/year, which is 
small compared to EOS-A1, but large enough to fill 6 StorageTek Near-Line Library 
Units (Silos) per year, each Silo containing 6000 3480 type cartridges. Moreover, the 
contribution from EOS-A1 is 895 GB/day, or about 50 times the TRMM 
contribution. SAR adds 591 GB/day, or about 33 times the TRMM contribution. 
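
The unit conversions and ratios quoted above can be checked with a short sketch. The daily volumes are the ones stated in the text; the helper names, the 365-day year, and the decimal convention (1 TB = 1000 GB) are assumptions made for illustration only.

```python
# Daily volumes (GB/day) as quoted in the text for each platform.
DAILY_GB = {"TRMM": 18, "EOS-A1": 895, "SAR": 591}

DAYS_PER_YEAR = 365   # assumption: no leap-year adjustment
GB_PER_TB = 1000      # assumption: decimal units


def tb_per_year(gb_per_day):
    """Convert a daily volume in GB to a yearly volume in TB."""
    return gb_per_day * DAYS_PER_YEAR / GB_PER_TB


def ratio_to_trmm(platform):
    """Express a platform's daily volume as a multiple of TRMM's."""
    return DAILY_GB[platform] / DAILY_GB["TRMM"]


if __name__ == "__main__":
    for name, gb in DAILY_GB.items():
        print(f"{name}: {gb} GB/day = {tb_per_year(gb):.1f} TB/year, "
              f"{ratio_to_trmm(name):.1f} x TRMM")
```

Under these assumptions the TRMM figure rounds to 6.6 TB/year, and the EOS-A1 and SAR ratios round to 50 and 33, in line with the values in the text.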

DATA VOLUME PER DAAC 

In Fig. 9, the total DADS daily data volumes, for the same 3 platforms, are given for 
each of the 7 DAACs. The 5 DAACs at JPL, LaRC, NSIDC, ASF, and MSFC vary in size 
from 3.5 to 8.3 GB/day, which is equivalent to 1.3 to 3.0 TB/year, or 20 to 45 TB in the 
15-year EOS data collection period. Thus, these 5 DAACs could be called Tera-DAACs. 

The other 2 DAACs at GSFC and EDC are roughly 2 orders of magnitude larger, and with 
EOS-A alone, each grows to a size of 2 to 3 PB over the 15-year EOS-A lifetime, thereby 
qualifying as Peta-DAACs. 
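
As a rough check on the Tera-DAAC and Peta-DAAC labels, the sketch below projects archive growth over the 15-year collection period from a constant daily rate. The function name, the constant-rate assumption, and the decimal units are illustrative; the 489 GB/day value is the GSFC EOS-A1 figure quoted in the media requirements section.

```python
DAYS_PER_YEAR = 365   # assumption: no leap-year adjustment
GB_PER_TB = 1000      # assumption: decimal units
MISSION_YEARS = 15    # EOS data collection period stated in the text


def archive_growth_tb(gb_per_day, years=MISSION_YEARS):
    """Archive size in TB after `years` at a constant daily ingest rate."""
    return gb_per_day * DAYS_PER_YEAR * years / GB_PER_TB


if __name__ == "__main__":
    # Smallest and largest of the 5 smaller DAACs, then GSFC for EOS-A1.
    for label, rate in [("small DAAC", 3.5), ("large Tera-DAAC", 8.3),
                        ("GSFC, EOS-A1", 489)]:
        print(f"{label}: {archive_growth_tb(rate):,.0f} TB in 15 years")
```

With these inputs the smaller DAACs land at roughly 19 to 45 TB, and GSFC at about 2.7 PB, which matches the ranges given in the text to within rounding.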


It should be noted that the data volumes given in Figs. 8 and 9 do not include additional 
data volume required due to backup of hard-to-replace data products and due to 
reprocessing of selected data sets. However, this is partially offset by the fact that CDOS 
provides the disaster backup for L0 data, so that it is necessary for the DAACs to archive 
L0 data for only a year. 

MEDIA REQUIREMENTS AT GSFC FOR EOS-A1 

In Fig. 10, the daily data volume of 489 GB/day at the GSFC DAAC for the EOS-A1 
platform is converted into a daily media requirement for several types of physical 
media, ignoring utilization efficiency. For 3480 type cartridges containing 200 MB of 
data, this translates into 2445 cartridges per day, enough to fill a 6000 cartridge 
StorageTek silo every 2.5 days. With the newer 3490 cartridges, having double the data 
density, it takes 5 days to fill the silo. The D1 and D2 tape technologies reduce the daily 
number of cartridges required by about 2 orders of magnitude. It is anticipated that 
technological progress toward higher density data recording will continue over the next 
7 years prior to EOS-A1 launch, resulting in a physically smaller and more manageable 
archive at that time than would be possible with current technology. 
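
The cartridge counts above follow from simple division, which the sketch below reproduces. The 200 MB capacity, the 6000-slot silo, and the 489 GB/day rate come from the text; the 400 MB value for the 3490 is inferred from "double the data density", and the function names and decimal units are assumptions.

```python
import math

MB_PER_GB = 1000    # assumption: decimal units
SILO_SLOTS = 6000   # cartridges per StorageTek silo, as stated in the text


def cartridges_per_day(daily_gb, cartridge_mb):
    """Whole cartridges needed per day, ignoring utilization efficiency."""
    return math.ceil(daily_gb * MB_PER_GB / cartridge_mb)


def days_to_fill_silo(daily_gb, cartridge_mb, slots=SILO_SLOTS):
    """Days until one silo is full at a constant daily volume."""
    return slots * cartridge_mb / (daily_gb * MB_PER_GB)


if __name__ == "__main__":
    # 3480 at 200 MB (from the text); 3490 assumed at 400 MB (double density).
    for name, capacity_mb in [("3480", 200), ("3490", 400)]:
        print(f"{name}: {cartridges_per_day(489, capacity_mb)} cartridges/day, "
              f"silo full in {days_to_fill_silo(489, capacity_mb):.1f} days")
```

The same two functions can be reused for any other medium in Fig. 10 by substituting its capacity in MB.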

TRANSFER RATES 

A potential bottleneck for timely EOSDIS DADS operations is in the available data 
transfer rates for read/write devices compatible with available storage media. Data 
transfer rates available for drives compatible with the types of media considered in Fig. 
10 are given in Fig. 11. With a transfer rate of 3.0 MBps, as is available for 3490 type 
cartridges, a single image file of 327 MB requires 109 seconds to physically read, again 
ignoring efficiency factors. Technological progress toward faster data transfer rates 
may be achieved prior to EOS-A1, but progress in this area has not been as rapid as in 
the area of higher density data recording. 
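
The read-time figure above, and the reason transfer rate is a potential bottleneck, can be illustrated with a small sketch. The 327 MB file size, the 3.0 MBps rate, and the 489 GB/day GSFC volume are from the text; the drive-count estimate is an illustrative extrapolation that assumes continuous streaming with no mount, seek, or rewind time, and the function names are hypothetical.

```python
import math

MB_PER_GB = 1000          # assumption: decimal units
SECONDS_PER_DAY = 86400


def read_seconds(file_mb, rate_mb_per_s):
    """Time to stream a file at a sustained rate, ignoring efficiency factors."""
    return file_mb / rate_mb_per_s


def min_drives_for_daily_volume(daily_gb, rate_mb_per_s):
    """Lower bound on drives needed to move one day's volume within one day.

    Assumes every drive streams continuously for 24 hours, which is
    optimistic; real operations would need more.
    """
    total_seconds = read_seconds(daily_gb * MB_PER_GB, rate_mb_per_s)
    return math.ceil(total_seconds / SECONDS_PER_DAY)


if __name__ == "__main__":
    print(f"327 MB image at 3.0 MBps: {read_seconds(327, 3.0):.0f} s")
    print(f"Drives to stream 489 GB/day at 3.0 MBps: "
          f"{min_drives_for_daily_volume(489, 3.0)}")
```

Even under these idealized assumptions, a single 3.0 MBps drive cannot move one day of GSFC EOS-A1 data within a day, and the requirement that distribution capacity match ingest adds further load.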

EOS (EARTH OBSERVING SYSTEM) GOALS ARE TO 
DEVELOP AND OPERATE: 


a) An observing system to acquire essential, global Earth 
science data on a long-term, sustained basis and in a 
manner which maximizes the scientific utility of the data and 
simplifies data analysis. 


b) A comprehensive data and information system to provide 
the Earth science research community with easy, affordable, 
and reliable access to the full suite of Earth science data 
from EOS and international partner observatories, NASA 
Earth Probes, and selected Earth science data from other 
sources. 


c) As the cornerstone of the Mission to Planet Earth Global 
Change Research Program, an integrated scientific research 
program to investigate processes in the Earth System and 
improve predictive models. 

i.e. TO ACQUIRE, ACCESS, ANALYZE EARTH SCIENCE DATA 


Fig 1 


PHASED IMPLEMENTATION 


Fig 2 


EOSDIS V0 


V0 TO PROVIDE: 

o Interconnection of existing data systems at DAACs 

o Prototyping of selected tasks in distributed IMS, 
networking, standards 

o Some additional Earth Observation data sets to be 
added to the existing Data Systems under V0 


V0 RELATIONSHIP TO ECS 

o Provides early experience/information/results for 
potential inclusion in ECS design 

o ECS contractor to connect ECS to V0 and provide a 
level of interoperability 

o Selected data sets from V0 to be copied for inclusion 
into ECS 


Fig 3 


Shaded areas are ECS responsibility 


Figure 1.4.1-1. ECS Logical System Architecture 

Fig 4 


DADS FUNCTIONS 


INGEST - Receive/Validate data products and data from CDOS, 
PGS, SCFs, other DAACs, ADCs, ODCs, EPDSs, IPs, Users, 
and others 


ARCHIVE - Store data and data products on archive media 


PROCESS ORDERS - Fulfill product orders provided by IMS: 
retrieve data from archive, subset, reformat, stage for 
delivery. Support reprocessing 


MANAGE SYSTEM - Monitor and report status and accounting 
information to SMC, operate File Storage Management System 
with hierarchical archive, schedule operations according to 
SMC directives, monitor media BER (10**-12) and provide for 
data restoration/migration, backup selected data 


DISTRIBUTION - Distribute data and data products to PGS, 
SCFs, other DAACs, ADCs, ODCs, EPDSs, IPs, Users, and 
others via networks (5 minutes) and by physical media (24 
hours). Provide daily distribution rate capability equivalent to 
daily ingest rate 